Abbreviations

6-dEB: 6-deoxyerythronolide B

ACP: acyl carrier protein 

AT: acyltransferase 

DEBS: 6-deoxyerythronolide B synthase

DH: dehydratase 

ER: enoylreductase 

KR: ketoreductase 

KS: ketosynthase 

NAC: N-acetylcysteamine 

NRPS: nonribosomal peptide synthetase 

PCP: peptidyl carrier protein 

PKS: polyketide synthase 




Modular polyketide synthases (PKSs) are multienzyme assemblies responsible for the
biosynthesis of numerous pharmacologically relevant natural products including the antibiotic
erythromycin and the immunosuppressant FK506. As shown in the schematic diagram of the
6-deoxyerythronolide B synthase (DEBS) in figure 1, the active sites of these enzymes are
organized into distinct modules, each of which is responsible for elongating the polyketide
chain by one ketide unit through the coordinated action of the three core active sites: a
ketosynthase (KS), an acyltransferase (AT), and an acyl carrier protein (ACP). In addition to
these three core active sites, there are a variable number of postcondensational active sites
within each module, including a ketoreductase (KR), a dehydratase (DH), and an
enoylreductase (ER), that generate structural diversity in the final product. The growing
polyketide chain is processively elongated as it passes through each of the modules in an
assembly line fashion such that the number of extensions is dictated by the number of
modules in the enzyme system. The choices of building blocks made by each module and the
number and types of domains within each module catalyzing postcondensation reactions
dictate the chemical functionality at each carbon atom in the final product.
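
The assembly-line logic described above can be sketched in a few lines of Python. This is an illustrative model only: the `Module` class, the `beta_carbon_state` helper, and the three-module example are hypothetical names and inputs, not a description of DEBS itself. The sketch assumes the usual reductive-loop logic, in which KR reduces the beta-keto group to a hydroxyl, DH dehydrates it to an alkene, and ER reduces the alkene to a methylene.

```python
from dataclasses import dataclass, field

# Every module is assumed to carry the three core active sites.
CORE_DOMAINS = ("KS", "AT", "ACP")


@dataclass
class Module:
    """Hypothetical stand-in for one PKS module."""
    name: str
    optional_domains: frozenset = field(default_factory=frozenset)


def beta_carbon_state(module: Module) -> str:
    """Functional group left at the beta-carbon after this module acts."""
    d = module.optional_domains
    if {"KR", "DH", "ER"} <= d:
        return "methylene"
    if {"KR", "DH"} <= d:
        return "alkene"
    if "KR" in d:
        return "hydroxyl"
    return "keto"


def run_assembly_line(modules):
    """One extension per module, in order; returns (module, group) pairs."""
    return [(m.name, beta_carbon_state(m)) for m in modules]


# Hypothetical three-module system (not the DEBS domain layout).
example = [
    Module("mod_a", frozenset({"KR"})),
    Module("mod_b"),
    Module("mod_c", frozenset({"KR", "DH", "ER"})),
]
result = run_assembly_line(example)
```

In this toy model the number of entries in `result` equals the number of modules, and reordering `example` reorders the functional groups along the chain, which is the premise behind module swapping.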

The unique organization of modular PKSs and the transparency of the functional code offer
tremendous potential for the use of these enzyme systems as a scaffold for the generation of
novel small molecules through combinatorial biosynthesis. Of all possible strategies for
generating new natural product-like molecules, the fusion of intact modules from different
sources (also referred to as "module swapping") presents one of the most appealing methods
of generating new compounds. According to this strategy, since each module controls the
functionality and stereochemistry of two adjacent carbon atoms, novel compounds can be
generated by simply rearranging the order of modules along the assembly line. While there
are a few examples of successful use of this strategy (1-3), it is still not clear what factors are
important in mediating intermodular transfer, and how much of a role each factor plays.



It has been previously shown that while individual modules of DEBS have inherent
specificities for small molecule substrates, these modules are still tolerant of a wide variety of
stereochemical variation in the substrate, and their inherent small molecule specificities do not
appear to be limiting in intermodular transfer.(3) By using a novel assay system in which
small molecule diketide substrates were covalently attached to donor ACP proteins as shown
in figure 2A, comparisons of kinetic parameters of donor protein-assisted substrate loading
(figure 2B) versus diffusive substrate loading (figure 2C) were made. These experiments
indicated that a channeling mechanism not only increases the specificity of a module for a
particular substrate by approximately 3-4 orders of magnitude, but can also facilitate
incorporation of otherwise poor substrates.(3, 4)

Linker regions at the N- and C-termini of each polypeptide interface (shown as matching tabs
in figure 1) have been previously identified as important factors for mediating specific
channeling between polypeptides. Consisting of approximately 30-90 hypervariable residues,
these linker regions have been suggested to form coiled-coils and have been shown to interact
pairwise and specifically with each other (i.e., the C-terminal linker of module 2 interacts
specifically with the N-terminal linker of module 3, and the C-terminal linker of module 4
interacts specifically with the N-terminal linker of module 5).(1, 5) While the importance of
these linker regions in mediating intermodular specificity has been demonstrated, other
interpolypeptide interactions have not been ruled out. The most likely candidate for relevant
intermodular interactions is the interface between the ACP domain of the upstream module
and the KS domain of the downstream module, since these two active sites are involved in
forming the tetrahedral intermediate of the trans-thioesterification reaction as the polyketide
intermediate is channeled from one module to the next.

To evaluate the relative contributions of the linker interactions and the donor ACP-acceptor
KS interactions, we used the assay system illustrated in figure 2B. All donor proteins were
loaded with the same (2S,3R)-2-methyl-3-hydroxypentanoyl thioester (hereafter referred to as
"diketide"), which is derived from 2 and which has been shown to be a good substrate for
DEBS modules 2, 3, 5, and 6.(4) Kinetic parameters relating to the substrate transfer,
elongation, and release were measured in the presence of different combinations of donor
ACPs, acceptor modules, and linkers. From these data, a distinct pattern emerged, providing
the framework for basic ground rules for engineering novel PKSs by module swapping.



Materials and Methods 



Nomenclature. The nomenclature used in this report for proteins containing linker regions is
identical to that used previously.(3, 5) Specifically, the module of origin of the linker is
placed in parentheses either before or after the name of the domain or module to which it is
attached, depending on whether it is an N- or a C-terminal linker, respectively. The
boundaries of ACP domains, KS domains, and linkers are defined as before.(7, 5) For a
protein whose linker region has been deleted, a null set symbol (0) is placed in the
parentheses. Accordingly, module 6 that has been engineered with the N-terminal linker from
module 5 is represented as (5)M6; likewise, ACP2 with no linker regions is represented as
ACP2(0). If a thioesterase domain is fused to the C-terminal end of a module, it is indicated
as such (e.g., (5)M5+TE).
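
The naming convention above is regular enough to be parsed mechanically. The following Python sketch is one possible reading of it, offered only as an illustration: the `Construct` tuple, the regular expression, and the function name are hypothetical, and the pattern covers only names of the forms that appear in this report (an optional N-terminal linker in parentheses, a domain or module name, an optional C-terminal linker in parentheses, and an optional +TE suffix).

```python
import re
from typing import NamedTuple, Optional


class Construct(NamedTuple):
    n_linker: Optional[str]   # module of origin of the N-terminal linker
    core: str                 # domain or module name, e.g. "M6" or "ACP2"
    c_linker: Optional[str]   # module of origin of the C-terminal linker
    te_fused: bool            # True if a thioesterase is fused (+TE)


# (N-linker)? name (C-linker)? (+TE)?  ; "0" stands for a deleted linker.
_PATTERN = re.compile(r"^(?:\((\d+)\))?([A-Za-z]+\d*)(?:\((\d+)\))?(\+TE)?$")


def parse_construct(name: str) -> Construct:
    """Split a construct name into its linker, core, and TE parts."""
    match = _PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized construct name: {name!r}")
    n_linker, core, c_linker, te = match.groups()
    return Construct(n_linker, core, c_linker, te is not None)


def linker_deleted(linker: Optional[str]) -> bool:
    """A linker written as (0) has been deleted."""
    return linker == "0"


m6 = parse_construct("(5)M6")
acp2 = parse_construct("ACP2(0)")
m5te = parse_construct("(5)M5+TE")
```

A value of `None` means the name does not mention a linker at that terminus, while `"0"` means the linker was explicitly deleted; the sketch keeps the two cases apart because the text does.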

Reagents and Chemicals. DL-[2-methyl-14C]Methylmalonyl-CoA (56 mCi/mmol) was
purchased from ARC, Inc. All other chemicals were purchased from Sigma-Aldrich. Buffer
A: 100 mM NaH2PO4, 2.5 mM DTT, 1 mM EDTA, 20% glycerol, pH 7.1. Buffer B: 100 mM
NaH2PO4, 10 mM imidazole, 1 M NaCl, 20% glycerol, pH 8.0. Buffer C: 400 mM NaH2PO4,
1 mM EDTA, 2.5 mM DTT, 20% glycerol, pH 7.1.



Construction of Plasmids. The construction of genes encoding (5)M2+TE, (3)M3+TE,
(5)M5+TE, and (5)M6+TE (pRSG64, pRSG34, pRSG46, and pRSG54, respectively) (1);
(5)M3+TE (pST132) (5); ACP4(4) (pNW8) (3); eryLDD (pJL636) (6); and NovH(0) (7)
have been previously described. (3)M5+TE encodes a derivative of DEBS module 5 in which
its natural N-terminal linker has been replaced with the N-terminal linker from module 3. The
N-terminal linker of module 3 was excised from pRSG34 (1) (which encodes (3)M3+TE) as
an NdeI-BsaBI fragment. The resulting fragment was used to replace the corresponding
NdeI-BsaBI fragment in pRSG45, which encodes (5)M5+TE,(1) to yield pST133. ACP2(2)
encodes the ACP domain of DEBS module 2 through its natural stop codon. This sequence
was extracted from the gene cluster as an NdeI-EcoRI fragment by PCR using the primers
5'-C CAT ATG CTG CGC GAC CGG CTG-3' and 5'-GAA TTC TCA ATC GCC GTC
GAG CTC C-3'. ACP2(4) encodes the ACP domain of DEBS module 2 with its natural
C-terminal linker replaced with the corresponding linker from module 4 using an engineered
SpeI site at the junction. The ACP domain was obtained as an NdeI-SpeI fragment by PCR
using the primers 5'-C CAT ATG GTG GTC GAC CGG CTC G-3' and 5'-ACT AGT
GAG GAA ACC GGC GAC CG-3' (sequences complementary to DEBS shown in bold).
Generation of the C-terminal linker region as an SpeI-EcoRI fragment by PCR has been
previously described.(5) These two fragments were cloned into pET28a to give pNW19.
ACP2(0) and ACP4(0) encode the ACP domain of DEBS module 2 and module 4,
respectively, with stop codons engineered at the end of the regions of homology. The coding
regions were obtained as NdeI-EcoRI fragments by PCR using the primers 5'-C CAT ATG
CTG CGC GAC CGG CTG-3' and 5'-GAA TTC TTA GCC GAG CTC GGC GTC-3' for
ACP2(0) and primers 5'-C CAT ATG GTG GTC GAC CGG CTC G-3' and 5'-GAA TTC
TTA GAA CAG CCT GTC CCG CAG-3' for ACP4(0). The PCR products were cloned
into pET28a to afford pNW6 (ACP2(2)), pNW7 (ACP2(0)) and pNW9 (ACP4(0)).
NovH(4) encodes the adenylation (A) and peptidyl carrier protein (PCP) domains of the
NovH open reading frame (ORF) from the novobiocin pathway.(7) It was fused to the
C-terminal linker of module 4 of DEBS as follows. DNA encoding NovH was derived from
pHC10 (7) as an NdeI-XhoI fragment. The linker region was obtained as an XhoI-Bpu1102I
fragment using the primers 5'-CTG CTC GAG AGG CTG TTC GCG GCC TCA-3' and
5'-C CCG CTG AGC CTA CAG GTC CTC TCC CC-3'. These two fragments were cloned
into pET28a to yield pNW35.
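
As a quick consistency check on the cloning strategy, the primers quoted above can be scanned for the restriction sites they are said to introduce. The Python sketch below is illustrative only: the helper names are hypothetical, and the recognition sequences for NdeI, EcoRI, SpeI, and XhoI are the standard ones rather than anything stated in this report. It also looks at the codon directly upstream of the EcoRI site on the coding strand, where the text describes a natural or engineered stop codon.

```python
import re

# Standard recognition sequences (all four are palindromic).
SITES = {"NdeI": "CATATG", "EcoRI": "GAATTC", "SpeI": "ACTAGT", "XhoI": "CTCGAG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Drop spaces, 5'/3' labels, and anything that is not A, C, G, or T."""
    return re.sub(r"[^ACGT]", "", seq.upper())


def reverse_complement(seq: str) -> str:
    return clean(seq).translate(_COMPLEMENT)[::-1]


def sites_in(primer: str) -> list:
    """Names of the listed enzymes whose site occurs in the primer."""
    s = clean(primer)
    return [name for name, site in SITES.items() if site in s]


def stop_codon_before_site(reverse_primer: str, enzyme: str):
    """Stop codon directly upstream of the site on the coding strand, or None."""
    coding = reverse_complement(reverse_primer)
    idx = coding.find(SITES[enzyme])
    if idx < 3:
        return None
    codon = coding[idx - 3:idx]
    return codon if codon in STOP_CODONS else None


# Primers copied from the text above.
acp2_2_fwd = "C CAT ATG CTG CGC GAC CGG CTG"
acp2_2_rev = "GAA TTC TCA ATC GCC GTC GAG CTC C"
acp2_4_rev = "ACT AGT GAG GAA ACC GGC GAC CG"
acp2_0_rev = "GAA TTC TTA GCC GAG CTC GGC GTC"
```

With these inputs the forward primer carries an NdeI site, the ACP2(2) and ACP2(0) reverse primers carry an EcoRI site preceded by a stop codon, and the ACP2(4) reverse primer carries a SpeI site with no stop codon, which is what a fusion junction would require.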

Expression and purification of individual modules. All previously characterized single
modules were expressed and purified as previously described.(4, 5) (3)M5+TE (pST133) was
expressed using a slightly modified version of the protocol used for previously characterized
individual modules.(4, 5) This protein was expressed in E. coli BAP1 (8), in which the sfp
phosphopantetheinyl transferase gene from Bacillus subtilis (9) has been inserted into the
chromosome. BAP1/pST133 cells were grown at 37°C in LB medium with 100 mg/L of
carbenicillin to an OD600 = 0.5, at which point they were cooled to 22°C in a water bath and
then induced with 0.7 mM IPTG for 12 hours. The cells were harvested by centrifugation,
washed with 50 mM Tris/1 mM EDTA (pH 8), and then resuspended in disruption buffer (100
mM NaH2PO4 (pH 7.2), 100 mM NaCl, 1.2 mM DTT, 1.2 mM EDTA, 0.7 mM benzamidine,
1 mg/L pepstatin, 1 mg/mL leupeptin, and 15% glycerol) before lysis by French Press (2x).
After the cell debris was removed by centrifugation, the supernatant was treated with a 0.1%
PEI precipitation followed by a 60% (NH4)2SO4 precipitation for 2 hours. The resulting
(NH4)2SO4 pellet was resuspended in buffer A (see Reagents and Chemicals section above for
composition), flash frozen in liquid nitrogen, and stored at -80°C until ready for further
purification. The crude protein was purified by FPLC on a hydrophobic butyl sepharose
column followed by a Resource Q anion exchange column as previously described (4, 5) to
yield 10 mg/L culture of purified (3)M5+TE.



Expression and purification of ACP and PCP proteins. Apo-ACP4(4) and apo-NovH(0)
were expressed in the E. coli strain BL21(DE3) and purified as previously described.(3, 7)
Apo-ACP2(2), apo-ACP2(0), apo-ACP4(0), apo-ACP2(4), and apo-NovH(4) were obtained
by overexpression of pNW6, pNW7, pNW9, pNW19, and pNW35, respectively, in the E. coli
strain BL21(DE3). After growth in LB (50 mg/L kanamycin) at 37°C to OD600 = 0.5-0.7, the
cells were cooled in a water bath to 22°C and then induced with 1 mM IPTG for 12 hours at 22°C.
The cells were then harvested by centrifugation, washed with 50 mM Tris (pH 8), and then
resuspended in buffer B before lysis by French Press (2x). The cell debris was cleared by
centrifugation and the supernatant batch loaded onto Ni-NTA agarose (Qiagen) resin (4 mL/L
culture) for 1 hour. The resin was loaded into a Flex-column (Kontes), washed with 10
column volumes of 35 mM imidazole in buffer B (see Reagents and Chemicals section above
for composition), and then the desired N-terminal His6-tagged proteins were eluted with 100
mM imidazole in buffer B. The appropriate fractions were concentrated and the buffers were
exchanged to buffer A (see Reagents and Chemicals section above for composition) + 1.5 M
(NH4)2SO4 in Centriprep spin columns (Amicon). Using an Akta FPLC system (Amersham
Pharmacia Biotech AB), the concentrated protein was loaded at 1 mL/min onto an XK 16/20
column packed with 30 mL Phenyl Sepharose High Performance resin and equilibrated with
the same buffer. A gradient from 1 M (NH4)2SO4 to 0 M (NH4)2SO4 in buffer A was applied,
resulting in the elution of the desired proteins between 150 mM and 0 mM (NH4)2SO4. The
appropriate fractions were concentrated and buffer exchanged to buffer A in Centriprep spin
columns to yield approximately 6 mg/L of ACP2(2), 15 mg/L culture of ACP2(4), 5 mg/L
culture of purified ACP2(0), and 3 mg/L culture of ACP4(0). These purified proteins were
then flash frozen in liquid nitrogen and stored at -80°C. Expression and purification of
apo-NovH(4) were performed under the same conditions as described for the ACP proteins,
except that expression was induced with 0.1 mM IPTG at 15°C. These conditions yielded 25 mg/L
culture of purified NovH(4). The masses of these proteins were confirmed by ESI-MS or
MALDI-MS. The parent masses of the proteins were found in all cases. Mass peaks 178
daltons less than the parent masses were found in some cases, corresponding to loss of
N-terminal N-formylmethionines. apo-ACP2(0): observed mass = 12073 (parent mass)
and 11895 (mass - formylmethionine), calculated mass = 12027. apo-ACP4(0): observed
mass = 11917 (parent mass), calculated mass = 11901. apo-ACP2(2): observed mass = 20532
(parent mass) and 20354 (mass - formylmethionine), calculated mass = 20495. apo-ACP2(4):
observed mass = 20635 (parent mass) and 20457 (mass - formylmethionine), calculated mass
= 20661. apo-NovH(4): observed mass = 74502 (parent mass) and 74323 (mass -
formylmethionine), calculated mass = 74626.
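
The apo-protein masses listed above can be cross-checked with simple arithmetic. The Python sketch below tabulates the reported values and computes, for each protein, the percent deviation of the observed parent mass from the calculated mass and the size of the mass loss attributed to formylmethionine. The dictionary layout and function names are hypothetical, and the 1% tolerance is an assumption borrowed from the instrument error range quoted later in this report for the diketide-loaded proteins.

```python
# Reported masses (Da) for the apo proteins; None where no second peak is listed.
APO_MASSES = {
    "apo-ACP2(0)": {"parent": 12073, "minus_fmet": 11895, "calculated": 12027},
    "apo-ACP4(0)": {"parent": 11917, "minus_fmet": None, "calculated": 11901},
    "apo-ACP2(2)": {"parent": 20532, "minus_fmet": 20354, "calculated": 20495},
    "apo-ACP2(4)": {"parent": 20635, "minus_fmet": 20457, "calculated": 20661},
    "apo-NovH(4)": {"parent": 74502, "minus_fmet": 74323, "calculated": 74626},
}

TOLERANCE_PERCENT = 1.0  # assumed instrument tolerance


def percent_deviation(observed: float, calculated: float) -> float:
    """Signed deviation of the observed mass from the calculated mass, in percent."""
    return 100.0 * (observed - calculated) / calculated


def fmet_loss(entry: dict):
    """Difference between the parent peak and the lighter peak, if one was reported."""
    if entry["minus_fmet"] is None:
        return None
    return entry["parent"] - entry["minus_fmet"]


deviations = {
    name: percent_deviation(e["parent"], e["calculated"])
    for name, e in APO_MASSES.items()
}
losses = {name: fmet_loss(e) for name, e in APO_MASSES.items()}
within_tolerance = all(abs(d) < TOLERANCE_PERCENT for d in deviations.values())
```

On these numbers every parent mass lies within about 0.4% of its calculated value, and the lighter peaks sit 178 or 179 Da below the parent, in line with the 178 Da loss described in the text.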

Chemoenzymatic synthesis of diketide-ACP and diketide-PCP substrates. The apo-PCP
and apo-ACP proteins were converted to their respective diketide-PCP and diketide-ACP
forms as previously described and as shown in figure 2A.(3) Briefly, phosphopantetheinylation
of each active site serine residue was catalyzed by sfp in the presence of 2, which was
synthesized as previously described.(3) The diketide-ACP/PCP substrates were either
immediately used in the module substrate incorporation assays or used after purification by
ion exchange chromatography. Purified protein concentrations were determined by Lowry
assay. The masses as well as complete phosphopantetheinylation were confirmed by ESI-MS
or MALDI-MS. An SDS-PAGE gel of the purified proteins is shown in figure 3. Representative
mass spectral data are shown in figure 4C, illustrating the purity and the complete conversion
from the apo-ACPs. The parent masses of the proteins were found in all cases. Mass peaks
178 daltons less than the parent masses were found in some cases, corresponding to loss of
N-terminal N-formylmethionines. All observed parent masses are within the 1% error range
that is expected from the spectrometers. Diketide-ACP2(0): observed mass = 12528 (parent
mass) and 12350 (mass - formylmethionine), calculated mass = 12498. Diketide-ACP4(0):
observed mass =
12372 (parent mass), calculated mass = 12353. Diketide-ACP2(2): observed mass = 20988
(parent mass) and 20809 (parent mass - formylmethionine), calculated mass = 20947.
Diketide-ACP2(4): observed mass = 21089 (parent mass) and 20910 (parent mass -
formylmethionine), calculated mass = 21113. Diketide-NovH(4): observed mass = 74783
(parent mass), calculated mass = 75078.

Substrate transfer and elongation assays. Qualitative assays were performed with a
diketide-ACP or diketide-PCP substrate either taken directly from the sfp
phosphopantetheinylation reaction or after further purification of the substrate. These assays
were performed with 20 µM diketide-ACP/PCP substrate for 2 hours in the following reaction
conditions: 1 µM acceptor module, 0.5 mM 14C-methylmalonyl-CoA, 4 mM NADPH in
buffer C, 30°C. After quenching by addition of 250 µL EtOAc and vortexing, the products
were extracted with 2 x 250 µL EtOAc, resolved on a silica gel TLC plate, and visualized on a
Packard InstantImager. A representative TLC plate image is shown in figure 8B.

Kinetic parameters were measured using purified diketide-ACP/PCP substrates in identical
reaction conditions as described above for the qualitative assays. k60µM values were
measured using 60 µM diketide-ACP substrate. A representative time course is shown in
figure 6E. These reactions were quenched by the addition of 80 µL 12.5% SDS to 20 µL
reaction mixture and immediate vortexing. The products were then extracted from the
aqueous phase with 2 x 250 µL EtOAc. After removing the organic solvents in vacuo, the
residual products were then spotted onto a TLC plate (Baker-flex 250 µm silica gel), resolved
in 60% EtOAc/40% hexanes, and the radioactive spots were visualized and quantified on a
Packard InstantImager. As previously described,(3) kcat/KM values were determined by
competitive assay of the substrate of interest against a substrate with known kcat/KM
parameters. Representative liquid scintillation counting data are shown in figure 6F.
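
For readers unfamiliar with competitive assays, the underlying relation can be sketched as follows. When two substrates compete for the same enzyme, the ratio of their product formation rates equals the ratio of their kcat/KM values weighted by their concentrations. The Python function below applies that textbook relation; it is a minimal sketch, not the exact procedure of reference 3, and the function name, argument names, and the numbers in the usage lines are hypothetical.

```python
def specificity_from_competition(ref_kcat_over_km: float,
                                 rate_test: float,
                                 rate_ref: float,
                                 conc_test: float,
                                 conc_ref: float) -> float:
    """Estimate kcat/KM of a test substrate from a competition experiment.

    Uses v_test / v_ref = (kcat/KM)_test [test] / ((kcat/KM)_ref [ref]).
    Rates must share one unit and concentrations must share one unit;
    the result is in the units of ref_kcat_over_km.
    """
    if min(rate_ref, conc_test, conc_ref) <= 0:
        raise ValueError("reference rate and concentrations must be positive")
    return ref_kcat_over_km * (rate_test / rate_ref) * (conc_ref / conc_test)


# Hypothetical example: reference substrate with kcat/KM = 390 min^-1 mM^-1,
# equal concentrations, test product formed at one fifth of the reference rate.
example_value = specificity_from_competition(
    ref_kcat_over_km=390.0, rate_test=0.02, rate_ref=0.10,
    conc_test=20.0, conc_ref=20.0,
)
```

Because only the ratio of rates enters the calculation, a single reaction containing both substrates can stand in for a full rate-versus-concentration curve, which is why the approach is economical with scarce protein-bound substrates.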

Results 



Protein preparations. ACP4(4) (i.e., DEBS ACP4 with its natural C-terminal linker)(3) and
eryLDD(0) (i.e., the DEBS loading didomain with no C-terminal linker)(6) were constructed
and expressed as previously described. ACP2(2) includes the DEBS ACP2 domain and its
natural C-terminal linker. The linker is defined as the sequence from the end of the ACP
consensus sequence to the natural stop codon.(5) ACP2(4) was constructed as a fusion
protein between ACP2 and the C-terminal linker of ACP4. ACP2(0) and ACP4(0) are
isolated ACP domains without linker regions. All proteins were expressed as N-terminally
His6-tagged apo proteins that could subsequently be purified by Ni-affinity chromatography to
yield 6 mg/L culture of ACP2(2), 15 mg/L culture of ACP2(4), 5 mg/L culture of purified
ACP2(0), 3 mg/L culture of ACP4(0), and 25 mg/L culture of NovH(4). These proteins
were converted to diketide-ACP and diketide-PCP substrates by phosphopantetheinylation
with sfp in the presence of 2, as previously described.(3) An SDS-PAGE gel of the purified
protein substrates is shown in figure 3. In addition, representative mass spectral data for
diketide-ACP2(2) and diketide-ACP2(4) are shown in figure 4C to demonstrate quantitative
phosphopantetheinylation by sfp.

(5)M2+TE, (3)M3+TE, (5)M3+TE, (5)M5+TE, and (5)M6+TE were constructed and
expressed as previously described.(1, 5) pST133 encodes (3)M5+TE, which is a fusion
protein of module 5 covalently attached to the thioesterase domain to facilitate turnover. In
addition, the natural N-terminal linker of module 5 is replaced with the N-terminal linker of
module 3. Expression and purification of this protein were carried out according to the
previously reported protocol.(4)

Analysis of the modularity of linker regions. The linker regions have previously been
suggested to be modular, or functionally independent.(1, 5) The kinetics of substrate transfer
at the module 2-module 3 interface followed by elongation and product release were
examined in terms of the k60µM and kcat/KM values of the overall reaction. The k60µM
values reported here represent the apparent overall rate of product formation at an initial
substrate concentration of 60 µM. In many cases, the k60µM values approximate the maximal
overall turnover rates, as determined by back-calculating the KM value for the reactions. True
saturation kinetics were not practical because of the technical limitations (e.g., solubility) and
limited supply associated with high molecular weight substrates such as diketide-ACP and
diketide-PCP. kcat/KM values were determined by competitive assay of the substrate of
interest against a substrate with a known kcat/KM value, as previously described.(3) This
method for determining kcat/KM values was chosen because it allowed us to conserve our
limited supply of protein-based substrates compared with a direct measurement of the initial
slope of a full v vs. [S] plot. A representative time course and liquid scintillation counting
data used to determine k60µM values and kcat/KM values are shown in figures 6E and 6F,
respectively.
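
The back-calculation mentioned above can be illustrated under the assumption of simple Michaelis-Menten behaviour, which this report does not state explicitly. If the observed rate at substrate concentration S is k_obs = kcat S / (KM + S), then 1/k_obs = 1/kcat + 1/((kcat/KM) S), so kcat and KM follow from one measured rate and one kcat/KM value. The Python sketch below uses hypothetical function and variable names and, as inputs, the 1.4 min⁻¹ and 390 min⁻¹ mM⁻¹ values reported in this section for the natural module 2-module 3 pair.

```python
def back_calculate_kcat_km(k_obs: float, kcat_over_km: float, s: float):
    """Return (kcat, KM) from a single-point rate and a kcat/KM value.

    k_obs        observed turnover rate at concentration s (min^-1)
    kcat_over_km specificity constant (min^-1 mM^-1)
    s            substrate concentration (mM)
    Assumes k_obs = kcat * s / (KM + s).
    """
    inv_kcat = 1.0 / k_obs - 1.0 / (kcat_over_km * s)
    if inv_kcat <= 0:
        raise ValueError("inputs are inconsistent with Michaelis-Menten kinetics")
    kcat = 1.0 / inv_kcat
    km = kcat / kcat_over_km
    return kcat, km


# 60 µM = 0.060 mM
kcat_est, km_est = back_calculate_kcat_km(k_obs=1.4, kcat_over_km=390.0, s=0.060)
fraction_of_max = 1.4 / kcat_est
```

Under these assumptions the estimated KM is a few micromolar, well below 60 µM, and the measured rate is roughly 94% of the estimated maximum, which is consistent with treating k60µM as an approximation of the maximal turnover rate for this reaction.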






In the first reaction, shown in figure 4A, diketide-ACP2 and module 3 with their natural linker
regions manifest k60µM and kcat/KM values of 1.4 min⁻¹ and 390 min⁻¹ mM⁻¹, respectively.
When the module 4-module 5 linker pairs are transplanted into the module 2-module 3
interface as shown in figure 4B, the k60µM value remains approximately the same, but the
kcat/KM value decreases to 56 min⁻¹ mM⁻¹. This comparison suggests that swapping out natural
linker pairs for alternative linker pairs affects the KM value of the transfer and elongation
reaction, but not the maximum rate.
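
The size of the KM effect follows from the two reported kcat/KM values if the maximal rate is taken to be unchanged. The short Python sketch below makes that arithmetic explicit; it assumes that k60µM (1.4 min⁻¹) approximates the maximal rate for both linker pairs, and the function and variable names are hypothetical.

```python
def apparent_km(k_max: float, kcat_over_km: float) -> float:
    """Apparent KM (mM) from a maximal rate (min^-1) and kcat/KM (min^-1 mM^-1)."""
    return k_max / kcat_over_km


K_MAX = 1.4  # min^-1, assumed equal for both linker pairs

native_km = apparent_km(K_MAX, 390.0)    # natural module 2-module 3 linkers
swapped_km = apparent_km(K_MAX, 56.0)    # module 4-module 5 linkers
fold_increase = swapped_km / native_km
```

With these inputs the apparent KM rises from roughly 3.6 µM to 25 µM, an increase of about 7-fold that is carried entirely by the change in kcat/KM.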

Analysis of the relative contributions of the donor ACP, acceptor KS and linkers to
chain elongation. Various donor ACP-acceptor module pairs were examined for their ability
to transfer substrates from the donor ACPs to the acceptor modules, which could then
elongate and release triketide lactone product. Two sets of reactions were carried out: one in
which the acceptor module was DEBS module 3 and the other in which the acceptor module
was DEBS module 5. For each set of reactions, reactions were performed representing one of
the following conditions: A) matched linkers and matched donor ACP-acceptor KS pairs, B)
mismatched linkers and matched ACP-KS pairs, C) matched linkers and mismatched ACP-KS
pairs, or D) mismatched linkers and mismatched ACP-KS pairs. As indicated by the
formation of the expected triketide lactone product, transfer of diketide from the donor ACP
to the acceptor module occurred at 20 µM substrate concentration in the reactions shown in
figures 5A-C and 6A-C. These successful reactions represent conditions A-C (as defined
above), and their kinetic parameters were further investigated. In contrast, no product was
detected at the same substrate concentrations from the reactions in figures 5D and 6D
(representing condition D), indicating that transfer did not occur in the presence of both
mismatched linkers and ACP-KS pairs. These qualitative data indicate that the diketide substrate
can be transferred to module 3 or 5 as long as either the linkers are matched or the ACP-KS
pairs are matched.
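
The qualitative outcome for conditions A-D amounts to a logical OR of the two kinds of matching. The Python sketch below encodes that rule as a small truth table. It is a summary of the observations reported above for acceptor modules 3 and 5 at 20 µM substrate, not a general predictor, and the function and dictionary names are hypothetical.

```python
def transfer_expected(linkers_matched: bool, acp_ks_matched: bool) -> bool:
    """Qualitative rule for modules 3 and 5: either kind of match suffices."""
    return linkers_matched or acp_ks_matched


# Conditions as defined in the text: (linkers matched, ACP-KS pair matched).
CONDITIONS = {
    "A": (True, True),
    "B": (False, True),
    "C": (True, False),
    "D": (False, False),
}

outcomes = {
    label: transfer_expected(linkers, acp_ks)
    for label, (linkers, acp_ks) in CONDITIONS.items()
}
```

Only condition D, in which both the linkers and the ACP-KS pair are mismatched, yields no expected product in this table.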



In order to quantify the relative contributions of the linker pairs versus the ACP-KS pairs to
the efficient channeling of substrates, k60µM and kcat/KM values were measured for the
reactions shown in figures 5A-C and 6A-C. The reactions of diketide-ACP2(2) + (3)M3+TE
(figure 5A) and diketide-ACP4(4) + (5)M5+TE (figure 6A) manifest k60µM values of 1.4 min⁻¹
and 9.3 min⁻¹ and kcat/KM values of 390 min⁻¹ mM⁻¹ and 290 min⁻¹ mM⁻¹, respectively. In
contrast to these reactions comprising matched linkers and matched ACP-KS pairs, the
reactions in which either the linkers are mismatched or the ACP-KS pairs are mismatched (but
not both) manifest significant and similar decreases in catalytic efficiencies and specificities.
While the k60µM and kcat/KM values for the mismatched reactions shown in figures 5B and 5C
fell approximately 3-5 fold and 80-200 fold, respectively, the corresponding values for the
mismatched reactions shown in figures 6B and 6C fell approximately 20-fold and 150-fold,
respectively. These data suggest that for both module 3 and module 5, the linker interactions
and the donor ACP-acceptor KS interactions play significant and approximately equal roles in
the channeling of substrates between modules.



Analysis of chain elongation by various acceptor modules in the presence of a linkerless
ACP4. Linker interactions were eliminated entirely from the transfer and elongation assays
in the reaction of linkerless diketide-ACP4(0) with (5)M2+TE, (5)M5+TE, (3)M5+TE, and
(5)M6+TE. Formation of the expected triketide lactone was observed from the reactions of
diketide-ACP4(0) with (5)M5+TE and (3)M5+TE (figure 7A and 7B), both of which
contained matched ACP-KS pairs. The reaction shown in figure 7A has k60µM and kcat/KM
values of 0.49 min⁻¹ and 4.1 min⁻¹ mM⁻¹, respectively, and the reaction in figure 7B has
corresponding values of 0.27 min⁻¹ and 2.5 min⁻¹ mM⁻¹, respectively. These values are
comparable to those observed when the linkers are mismatched and the ACP-KS pairs are
matched (figures 5B and 6B), indicating that, in this case, the presence of mismatched linkers
and the deletion of complete linker pairs are kinetically equivalent.

ACP4(0) was not able to efficiently transfer substrates to module 3, regardless of which
N-terminal linker was covalently fused to the module (figure 7C and 7D). This result was
expected based on the above observation that channeling to module 3 is eliminated in the
absence of both matched ACP-KS pairs and matched linker pairs. In contrast, transfer of
diketide from ACP4(0) to modules 2 and 6 was observed (figures 7E and 7F, respectively),
despite the elimination of linker interactions and the non-consecutive ACP-KS pairs. By
comparison to the kinetic parameters for the same reaction catalyzed by modules 2 and 6 in
the presence of matched linkers (figure 7G and 7H),(5) we note that the k60µM values drop
approximately 10-fold and the kcat/KM values drop approximately 70-300 fold when the linker
interactions are eliminated. These data suggest that modules 2 and 6 are weakly, but
demonstrably, more tolerant to unnatural donor proteins than modules 3 and 5.

Tolerance of modules 2 and 6 for unnatural donor proteins. ACP2(0), eryLDD(0),
NovH(0), and NovH(4) were examined as potential donor proteins for the transfer of diketide
to modules 2 and 6 (figure 8A; a radio-TLC image is shown in figure 8B). Reactions of these
same ACPs and PCPs were also performed with (3)M3+TE, (5)M3+TE, (5)M5+TE, and
(3)M5+TE. As predicted by previous experiments, substrate transfer from any of these
linkerless donor proteins to module 3 or 5 was not observed (data not shown). In contrast,
both the ACP domains (ACP2(0), eryLDD(0)) were able to channel the diketide substrate to
both (5)M2+TE and (5)M6+TE, despite the absence of matched linker pairs.

NovH(0) is an adenylation-peptidyl carrier protein (A-PCP) didomain involved in the
biosynthesis of the coumarin ring of novobiocin.(7) This protein has no apparent C-terminal
linker region as determined by sequence alignment and does not naturally interact with any
known PKS domain in its role in novobiocin biosynthesis. In our assays, NovH(0) was not
able to transfer the diketide substrate to either (5)M2+TE or (5)M6+TE without the benefit of
linker interactions. However, interaction between the NRPS-derived donor protein and PKS
modules could be induced by engineering the C-terminal linker from DEBS module 4 onto
the C-terminal end of NovH to create NovH(4). With the benefit of matched linker pairs,
NovH(4) was able to channel the diketide substrate to module 2 with a k60µM value of 0.16
min⁻¹ and a kcat/KM value of 3.5 min⁻¹ mM⁻¹ and to module 6 with a k60µM value of 0.53 min⁻¹
and a kcat/KM value of 8.7 min⁻¹ mM⁻¹. As the first demonstration of an engineered interface
involving the interaction of an NRPS domain that does not naturally interact with any PKS
domains and a PKS domain that does not naturally interact with any NRPS domains, the
experiment illustrates the power and utility of the linker regions for engineering artificial
interpolypeptide junctions.

Discussion



Understanding the factors that control the specificity of intermodular chain transfer is
fundamental to the ability to rationally engineer novel polyketide synthases via module
swapping. Among the factors to be considered are small molecule substrate specificity as
well as protein-protein interactions between the donor and acceptor modules. It has been
previously shown that while individual modules have defined specificities for small
molecules, there is considerable tolerance toward less favored stereochemical
configurations.(3) In addition, 30-90 residue linker regions at the N- and C-termini of the
bimodular polypeptides of DEBS have been identified and shown to contribute to the
specificity of intermodular transfers between two proteins.(1, 5) While these linker regions
are potentially powerful tools for enhancing specificity at engineered intermodular junctions,
it is likely that other protein-protein interactions are involved in mediating the specificity of
chain transfer. One of the most plausible candidates for relevant protein-protein interactions
is the interaction between the ACP domain of the donor module and the KS domain of the
acceptor module. These two domains presumably dock together as the substrate is channeled
from the ACP to the KS domain via a tetrahedral transition state; therefore, a certain degree of
spatial proximity can be inferred, suggesting the existence and relevance of additional
protein-protein interactions at the ACP-KS interface.



Modularity of the linker regions is essential for their use in mediating unnatural interactions
between modules from different sources. That is, engineering of the linker regions onto a
heterologous protein must be accompanied by a minimal kinetic penalty. To assess the
modularity of the two linker pairs from DEBS (i.e., the linker pair at the module 2-module 3
interface and the linker pair at the module 4-module 5 interface), kinetic parameters
describing the transfer from ACP2 to module 3 were determined for the two reactions in
which each matched linker pair was inserted into the module 2-module 3 interface.
Engineering of the heterologous module 4-module 5 linker pair into the module 2-module 3
junction had no effect on the maximal rate of transfer and elongation as compared to the
natural module 2-module 3 linker pair (figure 4); this is consistent with a previous
experiment examining the transfer over the same interface but using the full module 2 donor
protein.(5) However, replacing the natural linker pair with the heterologous linker pair
increases the K_M for the ACP2-module 3 reaction by approximately 7-fold. The contrast
between the uniformity of the k60µM term (which approximates the maximal rate) and the
variability of the K_M term in the presence of different linker pairs suggests that swapping out
the natural module 2-module 3 linker pair for the alternate module 4-module 5 linker pair
perturbs only the initial association-dissociation of ACP2 and module 3. When using the
full module 2 protein, the increase in K_M value upon swapping in the alternate linker pair is a
more modest 2-fold, suggesting that the more significant K_M effect observed with isolated
ACP2 may be an artifact of the truncated upstream protein.
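
The kinetic argument in this paragraph can be illustrated with a minimal Michaelis-Menten sketch. All numerical values below (the maximal rate, the baseline K_M, and the substrate concentrations) are hypothetical placeholders chosen only for illustration and are not measurements from this study; only the approximately 7-fold K_M ratio is taken from the text. The function names are likewise illustrative.

```python
def mm_rate(s, k_max, k_m):
    """Michaelis-Menten rate at substrate concentration s (same units as k_m)."""
    return k_max * s / (k_m + s)

# Hypothetical values for illustration only; not data from this study.
K_MAX = 1.0                      # maximal rate, assumed identical for both linker pairs
KM_NATURAL = 10.0                # K_M with the natural linker pair (arbitrary units)
KM_ALTERNATE = 7.0 * KM_NATURAL  # ~7-fold higher K_M, the fold change quoted in the text

def rate_ratio(s):
    """Rate with the natural linker pair divided by rate with the alternate pair."""
    return mm_rate(s, K_MAX, KM_NATURAL) / mm_rate(s, K_MAX, KM_ALTERNATE)

if __name__ == "__main__":
    for s in (0.1, 10.0, 1000.0, 100000.0):
        print(f"[S] = {s:>9.1f}  natural/alternate rate ratio = {rate_ratio(s):.2f}")
```

Under these assumptions the two reactions differ by nearly the full 7-fold factor at low donor concentrations and converge to the same rate as the donor approaches saturation, which is the behavior expected when a linker swap affects association-dissociation but not the maximal rate.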



To identify and quantify the relative contributions of various protein-protein interactions
involved in mediating substrate channeling, we have replaced the linkers on two donor ACP
domains (ACP2 and ACP4) as well as on the corresponding acceptor modules in a modified
version of the minimal donor ACP system that had been previously developed.(3) In two
independent data sets using the N-terminal modules 3 and 5 as the acceptor modules, baseline
kinetic parameters were first measured for reactions comprising both matched linkers and
consecutive ACP-KS domains (figures 5A and 6A). When either the linker regions or the
donor ACP was swapped such that either the linker pairs or the ACP-KS domains were now
mismatched, comparable attenuation of kinetic parameters was observed (figures 5B-C and
6B-C), indicating that for modules 3 and 5, the ACP-KS interactions and linker interactions
contribute comparably to the specificity of intermodular chain transfer.
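
The matched/mismatched bookkeeping behind these experiments can be sketched in a few lines of Python. This is only an illustration of one reading of the construct notation used in figures 5 and 6, under two assumptions: a donor written ACPx(y) carries the ACP of module x and the C-terminal linker of module y (0 meaning linkerless), and an acceptor written (z)Mw+TE carries the N-terminal linker of module z fused to module w. The function names and the "absent" label are hypothetical, and only ACP donors are handled.

```python
import re

# Natural DEBS interpolypeptide linker pairs:
# (donor C-terminal linker, acceptor N-terminal linker)
NATURAL_LINKER_PAIRS = {(2, 3), (4, 5)}

def parse_donor(name):
    """'ACP2(4)' -> (2, 4): ACP module number, linker number (0 = linkerless)."""
    m = re.fullmatch(r"ACP(\d)\((\d)\)", name)
    if m is None:
        raise ValueError(f"unrecognized donor name: {name!r}")
    return int(m.group(1)), int(m.group(2))

def parse_acceptor(name):
    """'(3)M5+TE' -> (3, 5): N-terminal linker number, module number."""
    m = re.fullmatch(r"\((\d)\)M(\d)\+TE", name)
    if m is None:
        raise ValueError(f"unrecognized acceptor name: {name!r}")
    return int(m.group(1)), int(m.group(2))

def classify(donor, acceptor):
    """Return (linker status, ACP-KS status) for a donor/acceptor pair."""
    acp, donor_linker = parse_donor(donor)
    acceptor_linker, module = parse_acceptor(acceptor)
    if donor_linker == 0:
        linkers = "absent"
    elif (donor_linker, acceptor_linker) in NATURAL_LINKER_PAIRS:
        linkers = "matched"
    else:
        linkers = "mismatched"
    # A matched ACP-KS pair means the acceptor is the module that naturally follows the donor ACP.
    acp_ks = "matched" if module == acp + 1 else "mismatched"
    return linkers, acp_ks

if __name__ == "__main__":
    for donor, acceptor in [("ACP2(2)", "(3)M3+TE"), ("ACP2(4)", "(3)M3+TE"),
                            ("ACP4(4)", "(5)M3+TE"), ("ACP2(2)", "(5)M5+TE")]:
        print(donor, "+", acceptor, "->", classify(donor, acceptor))
```

With this reading, the reaction pairs listed in the captions of figures 5 and 6 fall into the four expected combinations of matched and mismatched linkers and ACP-KS pairs.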



The reactions of linkerless ACP4 (i.e., ACP4(0)) with (5)M5+TE and (3)M5+TE (figures 7A
and 7B) demonstrated comparable kinetic parameters to the reactions between ACP4 and
module 5 comprising mismatched linkers (figure 6B). This indicates that eliminating linker
interactions through mismatched linkers is kinetically comparable to eliminating linker
interactions through physical deletion of the region. Furthermore, the kinetic effects observed
in the mismatched linker reactions are probably a result of the elimination of protein-protein
interactions rather than an artifact of protein engineering.

Whereas the KS domains of the N-terminal modules 3 and 5 are specific for their natural
upstream ACP domains, the KS domains of the C-terminal modules 2 and 6 are promiscuous
towards heterologous upstream ACP domains. ACP4(0) was observed to be capable of
transferring substrates to both (5)M2+TE and (5)M6+TE, despite the absence of matched
linker interactions (figures 7E and 7F). Kinetic analysis of these two reactions indicates that
the attenuation of kinetic efficiency and specificity compared to the corresponding reactions
comprising matched linkers (i.e., ACP4(4) + (5)M2+TE and ACP4(4) + (5)M6+TE) can be
accounted for entirely by the elimination of linker interactions. The dichotomy between
N-terminal modules (e.g., modules 3 and 5) and C-terminal modules (e.g., modules 2 and 6) of
DEBS can perhaps be rationalized in the context of their natural positions in the assembly
line. N-terminal modules such as DEBS modules 3 and 5 naturally accept incoming
substrates from an upstream module on a different polypeptide. Therefore, built-in specificity
for donor ACP domains would be highly advantageous for maintaining specific intermodular
transfers. On the other hand, C-terminal modules such as modules 2, 4, and 6 naturally accept
incoming substrates from covalently attached upstream modules, making specificity between
the donor ACP and the acceptor module less essential.

The generality of the tolerance of modules 2 and 6 for unnatural donor ACP domains was
elaborated using the linkerless, heterologous ACP domains ACP2(0) and eryLDD(0). In all
tested cases, channeling was observed even in the absence of matched linkers and consecutive
ACP-KS pairs (figure 8A). A natural extension of these observations is to explore the
tolerance of these KS domains for peptidyl carrier protein (PCP) domains derived from
nonribosomal peptide synthetases (NRPSs). While PCP and ACP domains share similar
three-dimensional structural folds and are functionally analogous, the homology of PCPs to
ACPs is generally relatively low (approximately 15-30%), contributing to very disparate
surface polarities.(10, 11) There are also numerous examples of hybrid NRPS-PKS gene
clusters in which PCP domains transfer substrates to KS domains.(12-21) NovH comprises
adenylation (A) and peptidyl carrier protein (PCP) domains and is involved in the formation
of the coumarin ring in the biosynthesis of novobiocin. As there are no PKS genes in the
novobiocin gene cluster, it is assumed that this A-PCP didomain does not naturally interact
with any PKS proteins during novobiocin biosynthesis. While NovH(0) failed to channel
substrates to (5)M2+TE or (5)M6+TE in the absence of matched linkers (figure 8A),
interaction between NovH and modules 2 and 6 could be effected by engineering the
C-terminal linker of DEBS module 4 onto the end of NovH such that the resulting NovH(4)
protein was capable of efficiently transferring substrates to modules 2 and 6. Although an
artificial intrapolypeptide NRPS-PKS interface has previously been created by replacing the
DEBS loading didomain with the rifamycin synthetase A-PCP loading didomain,(5) the
rifamycin A-PCP didomain naturally interacts with PKS domains on the same polypeptide,
indicating that it may be inherently more amenable to engineering into alternate NRPS-PKS
junctions. In contrast, this experiment with NovH(4) is to our knowledge the first example of
engineering a functional NRPS-PKS interface involving an NRPS domain that does not
naturally interact with any PKS proteins and a PKS domain that does not naturally interact
with any NRPS proteins. While this experiment biases the transfer reaction by eliminating the
small molecule recognition component of a true NRPS-PKS transfer, it indicates that the
heterologous linker regions are sufficient for inducing interaction between two naturally
non-interacting proteins and illustrates the potential of these linker regions for future
engineering of artificial interpolypeptide junctions.



The aggregate of these data begins to provide basic ground rules for the development of novel
polyketide synthases via module replacement. As mentioned above, it has been previously
demonstrated that linker pairs can be powerful tools for creating specificity in artificial
interpolypeptide junctions.(5) However, it is also essential to consider the origin of the
modules in the engineered junction as well as the modules in any competing junctions.
Whereas natural interpolypeptide junctions comprise a C-terminal module that channels
substrates to an N-terminal module (represented as C → N), artificial junctions should be
designed to represent one of the other three combinations (N → N, C → C, or N → C) in
order to maximize specificity in the engineered assembly line. Practical module swapping
experiments are underway to evaluate these proposals. In addition, complementary studies of
intrapolypeptide junctions will provide additional and essential foundations for the
development of modular PKSs as a scaffold for combinatorial biosynthesis.
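
As a rough illustration of this design rule, the short Python sketch below labels a proposed artificial junction by the native positions of its donor and acceptor modules. It assumes the DEBS layout in which each polypeptide carries two modules, so that modules 1, 3, and 5 are N-terminal and modules 2, 4, and 6 are C-terminal; the function names are hypothetical, and the rule encoded is the proposal stated above, not an experimentally validated predictor.

```python
# Native position of each DEBS module within its polypeptide
# (assumes two modules per polypeptide: odd = N-terminal, even = C-terminal).
POSITION = {1: "N", 2: "C", 3: "N", 4: "C", 5: "N", 6: "C"}

def junction_type(donor_module, acceptor_module):
    """Label a junction by the native positions of donor and acceptor, e.g. 'C -> N'."""
    return f"{POSITION[donor_module]} -> {POSITION[acceptor_module]}"

def fits_proposed_rule(donor_module, acceptor_module):
    """True if the junction is NOT of the natural interpolypeptide type (C -> N)."""
    return junction_type(donor_module, acceptor_module) != "C -> N"

if __name__ == "__main__":
    for donor, acceptor in [(4, 5), (4, 2), (4, 6), (3, 5), (1, 2)]:
        label = junction_type(donor, acceptor)
        verdict = "fits proposal" if fits_proposed_rule(donor, acceptor) else "natural type"
        print(f"module {donor} -> module {acceptor}: {label} ({verdict})")
```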

Figure Captions
Figure 1 

6-Deoxyerythronolide B synthase (DEBS) catalyzes the biosynthesis of 6-dEB (1), the aglycon
precursor of the antibiotic erythromycin. DEBS is composed of three polypeptides - DEBS1,
DEBS2, and DEBS3 - each of which comprises two modules for a total of six modules
(modules 1-6). Individual catalytic domains are represented by circles, and linker regions are
represented by solid tabs between DEBS1 and DEBS2 and between DEBS2 and DEBS3.
Each module contains three core catalytic domains - ketosynthase (KS), acyltransferase (AT),
and acyl carrier protein (ACP) - as well as a variable number of optional domains -
ketoreductase (KR), dehydratase (DH), and enoylreductase (ER). Polyketide biosynthesis is
initiated by the action of the loading didomain (LDD) at the N-terminus of DEBS1, which
primes the synthase with a C3 subunit derived from propionyl CoA. Biosynthesis of 1 then
proceeds in an assembly-line fashion such that the incoming polyketide chain is loaded onto
the KS of an extending module from the ACP of the previous module. This is followed by a
decarboxylative condensation reaction between the growing chain and a
methylmalonyl-derived C3 extender unit that has been loaded onto the ACP by the AT. This
C-C bond-forming reaction places the growing chain on the ACP, where it can then undergo
unique functionalization catalyzed by KR, DH, and ER before being passed to the KS of the
downstream module. This processive cycle of elongation and functionalization occurs until
the penultimate intermediate reaches the thioesterase (TE), which catalyzes macrocyclization
and product release to yield 1.

Figure 2

A) Setup and mechanism for the intermodular transfer and elongation assay. Diketide-S-CoA
is covalently attached to apo-ACP with the phosphopantetheinyl transferase Sfp to yield
diketide-ACP. When this substrate is added to a module with complementary protein-protein
interactions, the diketide is transferred to the KS of the acceptor module, where, in the
presence of methylmalonyl CoA extender units, it is elongated once and cyclized to
release the six-membered triketide lactone. B) The reaction of diketide-ACP4(4) +
(5)M5+TE as previously reported.(3) In this system, the diketide is channeled from the donor
ACP4(4) protein to the acceptor (5)M5+TE protein. C) The reaction of
diketide-S-N-acetylcysteamine + (5)M5+TE as previously reported. In this reaction, the
diketide is diffusively loaded onto the acceptor (5)M5+TE protein.

Figure 3 

SDS-PAGE image of the purified protein substrates. Only proteins which have not been 
previously reported are shown. A protein ladder is shown in the left-most lane. Lane A: 
Diketide-ACP2(2). Lane B: Diketide-ACP2(4). Lane C: Diketide-ACP2(0). Lane D:
Diketide-ACP4(0). Lane E: Diketide-NovH(4).

Figure 4

The modularity of the linker regions in the ACP2-module 3 interface. A) The reaction of
diketide-ACP2(2) + (3)M3+TE with the natural module 2-module 3 linker pair. B) The
reaction of diketide-ACP2(4) + (5)M3+TE with the alternate module 4-module 5 linker pair.
C) Representative mass spectra of the purified diketide-ACP substrates showing purity as well
as complete conversion from the apo-ACPs. Data for diketide-ACP2(2) and
diketide-ACP2(4) are shown here. In both cases, the major peak corresponds to the parent
mass minus 177, indicating loss of N-formylmethionine.

Figure 5

Schematic diagram and kinetic parameters of the four combinations of matched and
mismatched linker regions and matched and mismatched ACP-KS pairs with module 3 as the
acceptor module. A) The reaction of diketide-ACP2(2) + (3)M3+TE with matched linkers
and matched ACP-KS pairs. B) The reaction of diketide-ACP2(4) + (3)M3+TE with
mismatched linkers and matched ACP-KS pairs. C) The reaction of diketide-ACP4(4) +
(5)M3+TE with matched linkers and mismatched ACP-KS pairs. D) The reaction of
diketide-ACP4(4) + (3)M3+TE with mismatched linkers and mismatched ACP-KS pairs.

Figure 6 

Schematic diagram and kinetic parameters of the four combinations of matched and
mismatched linker regions and matched and mismatched ACP-KS pairs with module 5 as the
acceptor module. A) The reaction of diketide-ACP4(4) + (5)M5+TE with matched linkers
and matched ACP-KS pairs. B) The reaction of diketide-ACP4(4) + (3)M5+TE with
mismatched linkers and matched ACP-KS pairs. C) The reaction of diketide-ACP2(4) +
(5)M5+TE with matched linkers and mismatched ACP-KS pairs. D) The reaction of
diketide-ACP2(2) + (5)M5+TE with mismatched linkers and mismatched ACP-KS pairs. E)
Representative time course used to determine k60µM values. The data shown here correspond
to the reaction of diketide-ACP4(4) + (3)M5+TE. All reactions were performed in duplicate to
confirm reproducibility. F) Representative liquid scintillation counting data from the
competitive assays used to determine k_cat/K_M values. The data shown here correspond to the
reaction of 2 mM VDK-SNAC + 25 µM diketide-ACP4(4) + (3)M5+TE. The peak at 15
minutes corresponds to the product derived from VDK-SNAC
((2S,3R)-2-methyl-3-hydroxy-S-(N-acetylcysteamine)-heptanethioate), and the peak at 18
minutes corresponds to the product derived from diketide-ACP4(4). By measuring the initial
slope of a v vs. [S] plot, the k_cat/K_M value for VDK-SNAC + (3)M5+TE was previously
determined to be 0.078 min^-1 mM^-1 (data not shown). All reactions were performed in
duplicate at different ratios of competing substrates to confirm reproducibility.
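
The way a competitive assay of this kind yields a specificity constant can be sketched as follows. The sketch assumes the standard relation for two substrates competing for the same enzyme, in which the ratio of initial product formation rates equals the ratio of (k_cat/K_M) x [S] for the two substrates; the text does not spell out the exact calculation used, so this is an assumption. The reference value and concentrations come from the caption above, whereas the product ratio is a hypothetical placeholder and the function name is illustrative.

```python
def competitive_kcat_over_km(ref_kcat_km, ref_conc_mM, test_conc_mM, product_ratio):
    """Estimate k_cat/K_M of a test substrate from a competition experiment.

    Assumes v_test / v_ref = (kcat/KM)_test * [test] / ((kcat/KM)_ref * [ref]),
    with product_ratio = (product from test substrate) / (product from reference),
    measured under initial-rate conditions.
    """
    if test_conc_mM <= 0 or ref_conc_mM <= 0:
        raise ValueError("concentrations must be positive")
    return ref_kcat_km * product_ratio * ref_conc_mM / test_conc_mM

if __name__ == "__main__":
    REF_KCAT_KM = 0.078      # min^-1 mM^-1 for VDK-SNAC + (3)M5+TE (from the caption)
    REF_CONC_MM = 2.0        # 2 mM VDK-SNAC (from the caption)
    TEST_CONC_MM = 0.025     # 25 uM diketide-ACP4(4) (from the caption)
    PRODUCT_RATIO = 1.0      # hypothetical ratio of the 18-minute to the 15-minute peak
    estimate = competitive_kcat_over_km(REF_KCAT_KM, REF_CONC_MM, TEST_CONC_MM, PRODUCT_RATIO)
    print(f"estimated k_cat/K_M = {estimate:.2f} min^-1 mM^-1")
```

Because the estimate scales linearly with the measured product ratio, repeating the assay at different ratios of competing substrates, as described in the caption, provides a direct consistency check.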

Figure 7

Linker-less ACP4(0) as the donor protein. A) Diketide-ACP4(0) + (5)M5+TE. B)
Diketide-ACP4(0) + (3)M5+TE. C) Diketide-ACP4(0) + (5)M3+TE. D) Diketide-ACP4(0) +
(3)M3+TE. E) Diketide-ACP4(0) + (5)M2+TE. F) Diketide-ACP4(0) + (5)M6+TE.
G) Diketide-ACP4(4) + (5)M2+TE, shown for reference. H) Diketide-ACP4(4) +
(5)M6+TE, shown for reference.

Figure 8

A) Qualitative assessment of the ability of various donor proteins to transfer diketide
substrates to modules 2 and 6. In the columns, going left to right: diketide-ACP2(0),
diketide-eryLDD(0), diketide-NovH(0), diketide-NovH(4). In the rows, going down:
(5)M2+TE, (5)M6+TE. B) Representative radio-TLC image of qualitative assays. From left
to right, the lanes correspond to the reactions of diketide-ACP2(0), diketide-eryLDD(0),
diketide-NovH(0), and diketide-NovH(4) with (5)M2+TE. All reactions were performed
under the conditions described in the Materials and Methods section. The heavy spots at the
baseline correspond to methylmalonyl CoA and propionyl CoA (derived from
decarboxylation of methylmalonyl CoA) that were adventitiously extracted into the organic
layers. The spot at Rf = 0.05 in the diketide-eryLDD(0) reaction was not identified. The
reactions of the same substrates with (5)M6+TE afforded similar raw data.

Claim:

1. A method for combining a first and second module wherein said first and second
module originate from different polyketide synthase enzymes comprising: attaching said
first module and said second module using cognate intermodular linkers and cognate acyl
carrier protein and ketosynthase domains.

Abstract

6-Deoxyerythronolide B synthase (DEBS) is the modular polyketide synthase (PKS)
responsible for the biosynthesis of 6-dEB, the aglycon core of the antibiotic erythromycin.
The biosynthesis of 6-dEB proceeds in an assembly line fashion through the six modules of
DEBS, each of which catalyzes a dedicated set of reactions, such that the structure of the final
product is determined by the arrangement of modules along the assembly line. This
transparent relationship between protein sequence and enzyme function is common to all
modular PKSs and makes these enzymes an attractive scaffold for protein engineering
through module swapping. One of the fundamental issues relating to module swapping that
still needs to be addressed is the mechanism by which intermediates are channeled from one
module to the next. While it has been previously shown that short linker regions at the N- and
C-termini of adjacent polypeptides play an important role in mediating intermodular transfer,
the contributions of other protein-protein interactions have not yet been probed. Here, we
investigate the roles of the linker interactions as well as the interactions between the donor
acyl carrier protein (ACP) domain and the downstream ketosynthase (KS) domain in various
contexts. Linker interactions and ACP-KS interactions make relatively equal contributions at
the module 2-module 3 and the module 4-module 5 interfaces in DEBS. In contrast, modules
2 and 6 are more tolerant toward substrates presented by non-natural ACP domains. This
tolerance was exploited for engineering hybrid PKS-PKS and PKS-NRPS (non-ribosomal
peptide synthetase) junctions and suggests fundamental ground rules for engineering novel
chimeric PKSs in the future.